Abstract. The Quantum Kolkata restaurant problem is a multiple-choice version of the quantum
minority game, where a set of n non-communicating players have to choose one of m
choices. A payoff is granted to the players that make a unique choice. It has previously been shown
that shared entanglement and quantum operations can help the players coordinate their actions
and acquire higher payoffs than is possible with classical randomization. In this paper the initial
quantum state is expanded to a family of GHZ-type states and strategies are discussed in terms of
possible final outcomes. It is shown that the players individually seek outcomes that maximize the
collective good.
INTRODUCTION
Games and framework 

Game theory is the study of systematic and strategic decision-making in interactive
situations of conflict and cooperation. Its models are widely used in economics, political
science, biology and computer science to capture the behavior of individual participants
in terms of their responses to the strategies of the rest. The field attempts to describe
how decision makers do and should interact within a well-defined system of rules to
maximize their satisfaction with the outcome [1, 2, 3]. A game is a model of the strategies
of these decision makers, or players as we will call them, in terms of the choices
made by each of them. Each player is assumed to have an individual preference profile
$\sigma_{x_1} \succeq \sigma_{x_2} \succeq \cdots \succeq \sigma_{x_m}$ over a set of m outcomes
$\{\sigma_{x_j}\}$, where '$\succeq$' should be interpreted as "preferred to", and the $x_j$'s as
indices for possible outcomes. An outcome or strategy profile
$\sigma \in S_n \times S_{n-1} \times \cdots \times S_1 = S$ is equivalent to the combination of the
strategies $s_i^j \in S_i$ of the participants, where $s_i^j$ is the j'th strategy of player i, $S_i$ the set of
strategies or choices available to that player and $S$ the set of all possible strategy profiles.
In order to evaluate the profit or satisfaction of player i with regard to a strategy profile
we need to define for each player a payoff function $\$_i$ that takes a strategy profile $\sigma$ as
input and outputs a real numerical value as a measure of desirability. We have $\$_i : S \to \mathbb{R}$
and $\$_i(\sigma_k) > \$_i(\sigma_l) \iff \sigma_k \succ_i \sigma_l$. The question to answer is: what should rational players
choose to do given that they have partial or complete information about the content of $S$
and the payoff functions $\$_i$? The main approach is to find a solution concept, the
most famous one being the Nash equilibrium, where all players simply make the choice
$s_i^j$ that is the best possible response to any configuration in $S/S_i$, i.e. to any combination
of strategies by their counter-parties. In situations where no such equilibrium exists
one needs to extend the game to allow for mixed (probabilistic) strategies, where the
players extend the sets $S_i$ to $\Delta(S_i)$, i.e. the set of convex combinations of the $s_i^j$'s, to
acquire one. Just as classical probability distributions extend pure strategy games to mixed
ones, quantum probabilities, operations and entanglement can extend the framework to
outperform any classical setup.



Quantum games 

A quantum game is defined by a set $\Gamma$ of objects and the relationships between them:

$$\Gamma = \{\rho_{Q^n}, \mathcal{H}_{Q^n}, n, S_i, \$_i\} \quad \text{for } i = 1, \ldots, n \tag{1}$$

where $\mathcal{H}_{Q^n}$ is the Hilbert space of the composite quantum system, $\rho_{Q^n}$ is the initial
state of the game defined on $\mathcal{H}_{Q^n}$, n is the number of players, $S_i$ the set of available
strategies of player i and $\$_i$ the payoffs available to player i for each game outcome. In
our quantum game protocol the $m_i$ different pure strategies available to player i will be
encoded in the basis states of an $m_i$-level quantum system $\rho_{Q_i} \in \mathcal{H}_{Q_i}$. With n players
we end up needing an initial quantum state
$\rho_{Q^n} \in \mathcal{H}_{Q^n} = \mathcal{H}_{Q_n} \otimes \mathcal{H}_{Q_{n-1}} \otimes \cdots \otimes \mathcal{H}_{Q_1}$,
with $\dim(\mathcal{H}_{Q^n}) = \prod_{i=1}^{n} \dim(\mathcal{H}_{Q_i})$, to accommodate all possible game outcomes [4].

The strategies are chosen and played by each player through the application of a unitary
operator $U_i \in S_i = S(m_i)$ on their own subsystem, where the set of allowed quantum
operations $S(m_i)$ is some subset of the special unitary group SU($m_i$). The general
procedure of a quantum game consists of a transformation of the composite initial
state through local unitary operations by the players,
$U_n \otimes U_{n-1} \otimes \cdots \otimes U_1 : \mathcal{H}_{Q^n} \to \mathcal{H}_{Q^n}$,
followed by a measurement outcome or, in terms of pre-measurement reasoning, an
expectation value $\$_i : \mathcal{H}_{Q^n} \to \mathbb{R}$.

QUANTUM KOLKATA RESTAURANT PROBLEM 

This is a general form of a minority game [5, 6, 7], where n non-communicating agents
(players) have to choose between m choices. A payoff of $\$ = 1$ is paid out to the
players that make unique choices. Players making the same choice receive $\$ = 0$. The
challenge is to come up with a strategy profile that maximizes the expected payoffs $E_i(\$)$
of all players i and has the property of being a Nash equilibrium. In the absence of
communication, in a classical framework, there is nothing else to do but to randomize.
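
The classical baseline can be checked by brute force. The sketch below assumes that "randomize" means every player picks one of the m choices uniformly and independently; the function name `classical_expected_payoff` is illustrative and does not come from the source. It enumerates all equally likely outcomes and counts those in which a given player is alone with their choice.

```python
from fractions import Fraction
from itertools import product


def classical_expected_payoff(n_players: int, n_choices: int) -> Fraction:
    """Expected payoff of player 0 under uniform, independent randomization."""
    outcomes = list(product(range(n_choices), repeat=n_players))
    # Player 0 is paid only when nobody else picked the same choice.
    wins = sum(1 for o in outcomes if o.count(o[0]) == 1)
    return Fraction(wins, len(outcomes))


print(classical_expected_payoff(3, 3))  # 4/9
```

For three players and three choices, 12 of the 27 outcomes pay a given player, which gives the value 4/9.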



Collective aim in the quantum case 

It has been shown for the case of three players and three choices in the quantum
setting, starting with a GHZ-type state
$|\psi_{in}\rangle = \frac{1}{\sqrt{3}}(|000\rangle + |111\rangle + |222\rangle)$, that shared
entanglement and local SU(3) operations (by the players on their own subsystems)
lead to an expected payoff $E(\$) = \frac{6}{9}$. This is a 50% increase compared to the classical
payoff of $\frac{4}{9}$ reachable through randomization. Although the details of the protocol can
be found in [8], it is instructive to jump back a couple of steps. Since we have three
players with three allowed pure choices, the Hilbert space we are dealing with is the
space of three-qutrit states, with a basis $B = \{|ijk\rangle\}$, $i, j, k \in \{0, 1, 2\}$, each element of which
represents a post-measurement outcome of the game, where i, j, k denote the final
choices of players 1, 2, 3 respectively. We have
$\mathrm{span}(B) = \{\sum_{i,j,k=0}^{2} a_{ijk}|ijk\rangle : i, j, k = 0, 1, 2 \text{ and } a_{ijk} \in \mathbb{C}\}$,
which with a normalization condition gives us the complete Hilbert
space of the game. We can divide B into subsets that are interesting from the point of
view of the possible outcomes:



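$$L = \{|000\rangle, |111\rangle, |222\rangle\}, \tag{2}$$

$$G = \{|012\rangle, |120\rangle, |201\rangle, |021\rangle, |102\rangle, |210\rangle\}, \tag{3}$$

$$D_1 = \{|011\rangle, |022\rangle, |100\rangle, |122\rangle, |200\rangle, |211\rangle\}, \tag{4}$$

$$D_2 = \{|101\rangle, |202\rangle, |010\rangle, |212\rangle, |020\rangle, |121\rangle\}, \tag{5}$$

$$D_3 = \{|110\rangle, |220\rangle, |001\rangle, |221\rangle, |002\rangle, |112\rangle\}, \tag{6}$$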



where L contains all states for which none of the players 1, 2, 3 receives any payoff. It is
thus a collective objective to avoid these states. G contains all those states that return a
payoff $\$ = 1$ to all three of them, and the sets $D_i$ contain the post-measurement states
that lead to a payoff $\$ = 1$ for player i and $\$ = 0$ for players $j \neq i$. Thus the general goal of
each player i is to maximize the probability of the post-measurement outcome being a
state in $G_i = G \cup D_i$. Starting with an initial state $|\psi_{in}\rangle$, each player $i \in \{1, 2, 3\}$ applies an
operator from its set of allowed strategies $S_i \subseteq$ SU(3), transforming it to the final state
$|\psi_{fin}\rangle = U_1 \otimes U_2 \otimes U_3 |\psi_{in}\rangle$. The expected payoff $E(\$_i)$ of player i is the probability of
the post-measurement outcome being a state in $G_i$:

$$E(\$_i) = \sum_{|\xi\rangle \in G_i} |\langle \psi_{fin} | \xi \rangle|^2. \tag{7}$$
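
The partition of the basis and the payoff sum over $G_i = G \cup D_i$ can be written out as a minimal sketch. The names `classify` and `expected_payoff`, and the representation of a state as a dictionary from outcome tuples (i, j, k) to complex amplitudes, are assumptions made for illustration and are not taken from the source.

```python
from itertools import product

# (i, j, k): final choices of players 1, 2, 3
BASIS = list(product(range(3), repeat=3))


def classify(outcome):
    """Return 'L', 'G', or 'D1'/'D2'/'D3' for a basis outcome (i, j, k)."""
    distinct = len(set(outcome))
    if distinct == 1:
        return "L"  # everybody collides, nobody is paid
    if distinct == 3:
        return "G"  # all choices differ, everybody is paid
    # Exactly two players collide; the remaining one made the unique choice.
    for p in range(3):
        if outcome.count(outcome[p]) == 1:
            return f"D{p + 1}"


def expected_payoff(state, player):
    """Total probability of outcomes in G_i = G ∪ D_i for player 1, 2 or 3.

    `state` maps basis tuples to amplitudes and is assumed to be normalized.
    """
    winning = {"G", f"D{player}"}
    return sum(abs(a) ** 2 for o, a in state.items() if classify(o) in winning)


sizes = {}
for o in BASIS:
    sizes[classify(o)] = sizes.get(classify(o), 0) + 1
print(sorted(sizes.items()))
# [('D1', 6), ('D2', 6), ('D3', 6), ('G', 6), ('L', 3)]
```

As a sanity check, an equal-amplitude superposition of all 27 basis states reproduces the classical value: `expected_payoff({o: 27 ** -0.5 for o in BASIS}, 1)` evaluates to 12/27 up to rounding.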

Given that we have an initial state in span(L), containing all states of the form
$|\psi_{in}\rangle = \alpha|000\rangle + \beta|111\rangle + \gamma|222\rangle$ with $\alpha, \beta, \gamma \in \mathbb{C}$
(we will assume $0 < \alpha, \beta, \gamma \in \mathbb{R}$ later for
simplicity), what is the rational aim of player i? First, note that all states in span(L) are
unbiased with regard to changes in player positions. We can assume that the players don't even
know which qutrit they control, since that knowledge doesn't add any useful information
for the choice of $U_i \in S_i$. Second, any choice of $U_i$ aimed at increasing the probability
of the post-measurement state ending up in $D_i$ must, due to the symmetry of the setup,
increase the probability of the state being in $D_{j \neq i} \cup D_{k \neq i,j}$ in the ratio 2:1. Third, an outcome
in $G = G_1 \cap G_2 \cap G_3$ is as favorable as any outcome in $D_i$ for player i. It follows
that the players should aim to produce a state in span(G) to the extent this is possible.
It was shown in [8], however, that they fail to fully depart from span(L): for
$\alpha = \beta = \gamma = \frac{1}{\sqrt{3}}$ they reach a maximum payoff of $E(\$) = \frac{6}{9}$ rather than
$E(\$) = 1$, with a final state of the following form:

$$|\psi_{fin}\rangle = \frac{1}{3}\big(|000\rangle + |012\rangle + |021\rangle + |102\rangle + |111\rangle + |120\rangle + |201\rangle + |210\rangle + |222\rangle\big). \tag{8}$$

FIGURE 1. Payoffs associated with optimal strategies in a three player game with a variable initial
state. The expected payoff $E(\$)$ decreases as the level of entanglement decreases.
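
A final state with the nine-term support of Eq. (8) can be reproduced numerically with one simple choice of local operator. The sketch below applies the single-qutrit discrete Fourier transform F on each subsystem of the GHZ-type state. This choice is an assumption made for illustration and is not claimed to be the operator used in [8]; F is unitary and differs from an element of SU(3) only by a global phase, which does not affect measurement probabilities. All names (`F`, `apply_same_local`, `ghz`) are hypothetical.

```python
import cmath
from itertools import product

M = 3  # choices per player (qutrits)
OMEGA = cmath.exp(2j * cmath.pi / M)
# Unitary discrete Fourier transform on one qutrit: F[t][s] = omega^(t*s) / sqrt(3)
F = [[OMEGA ** (t * s) / M ** 0.5 for s in range(M)] for t in range(M)]


def apply_same_local(state, U):
    """Apply U ⊗ U ⊗ U to a three-qutrit state {(i, j, k): amplitude}."""
    out = {}
    for target in product(range(M), repeat=3):
        out[target] = sum(
            a * U[target[0]][s[0]] * U[target[1]][s[1]] * U[target[2]][s[2]]
            for s, a in state.items()
        )
    return out


# GHZ-type initial state (|000> + |111> + |222>) / sqrt(3)
ghz = {(c, c, c): 1 / M ** 0.5 for c in range(M)}
final = apply_same_local(ghz, F)

# Outcomes with non-vanishing amplitude: the nine outcomes appearing in Eq. (8)
support = sorted(o for o, a in final.items() if abs(a) > 1e-9)
# Player 1 is paid whenever their choice is unique (outcomes in G or D_1)
payoff = sum(abs(a) ** 2 for o, a in final.items() if o.count(o[0]) == 1)

print(support)
print(round(payoff, 6))  # 0.666667
```

Each surviving amplitude equals 1/3, because the three GHZ terms interfere constructively exactly when i + j + k is a multiple of 3 and cancel otherwise.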



Expected payoffs for initial states in span(L) 

Due to the symmetries mentioned in the previous section the three players will have
to choose a unitary operator U that takes states from span(L) to $\mathrm{span}(L \cup G)$, without the
possibility of favoring any subset of G (even if that possibility existed, it would only put
a ceiling on the individual payoffs, and any choice of subset other than the whole would
decrease the expected payoff, since the players cannot coordinate their choice of subset). This
leads to the conclusion that there exists a U (fixed up to a global phase) for any initial
state $|\psi_{in}\rangle = \alpha|000\rangle + \beta|111\rangle + \gamma|222\rangle$ that maximizes the individual expected payoffs
and is a Nash equilibrium solution, since any departure from this strategy will lead to a
lower payoff. We have:



$$E_{max}(\$) = \sum_{|\xi\rangle \in G_i} |\langle \psi_{in} | U^\dagger \otimes U^\dagger \otimes U^\dagger | \xi \rangle|^2 \tag{9}$$



simultaneously for all three of them. Figure 1 shows numerically calculated expected
payoffs $E(\$)$ for $\alpha = \sin\vartheta\cos\varphi$, $\beta = \sin\vartheta\sin\varphi$, $\gamma = \cos\vartheta$, where $\varphi$ and $\vartheta$
are sampled on a grid indexed by M and N respectively, with $M, N = 1, 2, \ldots, 20$. This gives
a total of 400 optimizations with equally many associated operators $U_{MN}$. We see that the
expected payoff is maximized for $\varphi = \frac{\pi}{4}$ and $\vartheta = \cos^{-1}\frac{1}{\sqrt{3}}$,
where the initial state is maximally entangled, and falls off towards the
classical expected payoff as the entanglement decreases.
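
The angles at which the initial state is maximally entangled follow directly from the parametrization by requiring equal coefficients, $\alpha = \beta = \gamma$. A short derivation, assuming both angles lie strictly between 0 and $\pi/2$ so that all three coefficients are positive (a restriction adopted here for the sketch):

```latex
\begin{align}
\alpha = \beta
  &\;\Rightarrow\; \sin\vartheta\cos\varphi = \sin\vartheta\sin\varphi
   \;\Rightarrow\; \tan\varphi = 1
   \;\Rightarrow\; \varphi = \tfrac{\pi}{4}, \\
\alpha = \gamma
  &\;\Rightarrow\; \tfrac{1}{\sqrt{2}}\sin\vartheta = \cos\vartheta
   \;\Rightarrow\; \tan\vartheta = \sqrt{2}
   \;\Rightarrow\; \cos\vartheta = \tfrac{1}{\sqrt{1+\tan^{2}\vartheta}} = \tfrac{1}{\sqrt{3}}.
\end{align}
```

Both conditions together give $\alpha = \beta = \gamma = \frac{1}{\sqrt{3}}$, the symmetric GHZ-type state.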



CONCLUSIONS 

The ambition of self-maximization in the studied Kolkata restaurant problem leads
individual players to act in such a way that the collective good is maximized. The game is
symmetric with regard to permutations of player positions, which guides the participants
to aim for a set of outcomes that favors them all. The expected payoff reachable through
local operations changes with the level of entanglement in the initial state.